keyword extraction
The term "key phrase extraction" is used when we want to stress that the result is not limited to a single word.
I want to be able to extract phrases such as "100 Personnel Systems for 100 People" or "Evaluating by Harmony, Generalists are Chosen."
Even what is commonly called "key phrase extraction" is often restricted to patterns such as "a sequence of nouns" or "adjective* noun+".
Such constrained methods cannot extract key phrases like the ones above.
Often you want to use, as a key phrase, a string that does not appear verbatim in the text.
For example, when the phrases "information sharing" and "sharing information" are both used, I want them to be connected by a single "information sharing" link.
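A minimal sketch of the linking idea above, assuming English text split on whitespace (the original Japanese would need a tokenizer). The helper names `phrase_key` and `group_variants` are hypothetical. It only treats word order as irrelevant; it does not handle inflection or synonyms.

```python
def phrase_key(phrase):
    """Order-insensitive key: the sorted tuple of lowercased words."""
    return tuple(sorted(phrase.lower().split()))


def group_variants(phrases):
    """Group phrases that share the same order-insensitive key."""
    groups = {}
    for p in phrases:
        groups.setdefault(phrase_key(p), []).append(p)
    return groups


groups = group_variants(
    ["information sharing", "sharing information", "key phrase"]
)
# Both word orders land under the same key, so they can share one link.
linked = groups[("information", "sharing")]
```

Any member of a group could then serve as the canonical link name, for example the first one seen.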
technique
An approach that does not use linguistic knowledge
Needs a stopword list
Throws away word-order information.
The "General Manager's Association" problem, where a compound term is split into separate words.
Synonyms are treated as different words.
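The weaknesses listed above can be seen in a small bag-of-words sketch. The stopword list and the function name `bag_of_words` are assumptions made for illustration, not part of the original notes.

```python
from collections import Counter

# Hypothetical, hand-written stopword list: a binary keep/drop decision.
STOPWORDS = {"the", "a", "of", "is", "and", "to", "in"}


def bag_of_words(text):
    """Count words after stripping punctuation and dropping stopwords."""
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    return Counter(w for w in words if w and w not in STOPWORDS)


a = bag_of_words("Sharing of information is key.")
b = bag_of_words("Information sharing is key.")
# Word order is discarded, so the two sentences become indistinguishable.
same = (a == b)

# Synonyms are counted as unrelated words.
c = bag_of_words("car automobile car")
```

A compound term is also split by `text.split()`, so its parts are counted separately and the term itself never appears as a candidate.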
collocation
N-grams, etc.
co-occurrence within a window
co-occurrence within a document
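The three variants above can be sketched as follows, assuming the text is already tokenized into a list of words. The function names and the window definition (pairs at most `window - 1` positions apart) are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def ngrams(tokens, n):
    """All runs of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def window_cooccurrence(tokens, window):
    """Count unordered pairs of distinct words at most window-1 positions apart."""
    counts = Counter()
    for i, w in enumerate(tokens):
        for v in tokens[i + 1:i + window]:
            if v != w:
                counts[tuple(sorted((w, v)))] += 1
    return counts


def document_cooccurrence(docs):
    """Count in how many documents each unordered word pair appears together."""
    counts = Counter()
    for tokens in docs:
        counts.update(combinations(sorted(set(tokens)), 2))
    return counts


tokens = "we value information sharing in teams".split()
bigrams = ngrams(tokens, 2)
win = window_cooccurrence(tokens, 3)
doc = document_cooccurrence(
    [["information", "sharing"], ["sharing", "information", "culture"]]
)
```

N-grams keep word order, while the two co-occurrence counts ignore it, so "information sharing" and "sharing information" contribute to the same pair.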
Approaches that assign a real-valued score, whereas a stopword list is a binary 0/1 decision
'The less frequently it appears in other texts, the more appropriate it is to characterize this text.'
A word may occur frequently on its own yet still form an important key phrase as part of an idiom.
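One common way to realize the quoted idea is a TF-IDF style score. The notes do not name a specific formula, so the weighting below (relative term frequency times log of inverse document frequency) is an assumption, as is the function name `tf_idf`. It expects non-empty, already tokenized documents.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of non-empty token lists. Returns one {word: score} dict per doc."""
    n = len(docs)
    df = Counter()  # number of documents containing each word
    for tokens in docs:
        df.update(set(tokens))
    scores = []
    for tokens in docs:
        tf = Counter(tokens)
        scores.append(
            {w: (c / len(tokens)) * math.log(n / df[w]) for w, c in tf.items()}
        )
    return scores


docs = [
    ["the", "personnel", "system", "the", "people"],
    ["the", "graph", "rank"],
    ["the", "people", "share"],
]
scores = tf_idf(docs)
# "the" appears in every document, so its score is 0 instead of being
# removed by a 0/1 stopword decision.
```

Because the score is per word, a frequent word gets a low score even when it is part of an important idiom. Scoring n-grams as units instead of single words is one possible workaround.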
graph-based
Build a graph of word adjacencies and choose the words with the highest rank.
Use PageRank
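A minimal sketch of the graph-based approach, assuming tokenized input, an undirected adjacency graph, and a plain power-iteration PageRank. The damping factor 0.85 and the fixed iteration count are conventional choices, not values taken from these notes.

```python
from collections import defaultdict


def adjacency_graph(tokens):
    """Undirected graph linking each word to the words directly next to it."""
    graph = defaultdict(set)
    for a, b in zip(tokens, tokens[1:]):
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power iteration; every node has a neighbor, so no dangling nodes."""
    nodes = list(graph)
    rank = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iterations):
        rank = {
            v: (1 - damping) / len(nodes)
            + damping * sum(rank[u] / len(graph[u]) for u in graph[v])
            for v in nodes
        }
    return rank


tokens = "graph rank graph word graph score".split()
rank = pagerank(adjacency_graph(tokens))
top = max(rank, key=rank.get)  # the word adjacent to the most other words
```

In practice stopwords would probably be filtered out before building the graph, otherwise words like "the" tend to become hubs.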
---
This page is auto-translated from /nishio/キーワード抽出. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.